Overview

Dataset statistics

Number of variables: 21
Number of observations: 179078
Missing cells: 513118
Missing cells (%): 13.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 28.7 MiB
Average record size in memory: 168.0 B
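
The derived figures in this table can be recomputed from the raw counts. The snippet below is a minimal sketch of that arithmetic using only the values shown above; the variable names are illustrative, and the record-size result is approximate because the reported total size (28.7 MiB) is already rounded.

```python
# Values copied from the "Dataset statistics" table.
n_variables = 21
n_observations = 179_078
missing_cells = 513_118
total_size_mib = 28.7  # rounded in the report

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.0f} B")
```

Both results agree with the reported 13.6% and 168.0 B after rounding.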

Variable types

Numeric: 8
Categorical: 13

Alerts

batsman has a high cardinality: 516 distinct values (High cardinality)
non_striker has a high cardinality: 511 distinct values (High cardinality)
bowler has a high cardinality: 405 distinct values (High cardinality)
player_dismissed has a high cardinality: 487 distinct values (High cardinality)
fielder has a high cardinality: 499 distinct values (High cardinality)
wide_runs is highly correlated with extra_runs (High correlation)
legbye_runs is highly correlated with extra_runs (High correlation)
batsman_runs is highly correlated with total_runs (High correlation)
extra_runs is highly correlated with wide_runs and 1 other field (High correlation)
total_runs is highly correlated with batsman_runs (High correlation)
is_super_over is highly correlated with inning (High correlation)
bye_runs is highly correlated with dismissal_kind (High correlation)
penalty_runs is highly correlated with dismissal_kind (High correlation)
inning is highly correlated with is_super_over (High correlation)
dismissal_kind is highly correlated with bye_runs and 1 other field (High correlation)
wide_runs is highly correlated with extra_runs and 1 other field (High correlation)
bye_runs is highly correlated with extra_runs (High correlation)
penalty_runs is highly correlated with extra_runs (High correlation)
batsman_runs is highly correlated with total_runs and 1 other field (High correlation)
extra_runs is highly correlated with wide_runs and 4 other fields (High correlation)
total_runs is highly correlated with wide_runs and 3 other fields (High correlation)
dismissal_kind is highly correlated with batsman_runs and 1 other field (High correlation)
player_dismissed has 170244 (95.1%) missing values (Missing)
dismissal_kind has 170244 (95.1%) missing values (Missing)
fielder has 172630 (96.4%) missing values (Missing)
wide_runs has 173673 (97.0%) zeros (Zeros)
legbye_runs has 176141 (98.4%) zeros (Zeros)
batsman_runs has 70845 (39.6%) zeros (Zeros)
extra_runs has 169541 (94.7%) zeros (Zeros)
total_runs has 63002 (35.2%) zeros (Zeros)
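
The percentages in the Missing and Zeros alerts are each count divided by the number of observations (179078). A small sketch, assuming nothing beyond the counts listed in those alerts, recomputes the shares; the helper name `share` is illustrative and not part of pandas-profiling.

```python
n_observations = 179_078

# Counts copied from the Missing and Zeros alerts.
missing = {
    "player_dismissed": 170_244,
    "dismissal_kind": 170_244,
    "fielder": 172_630,
}
zeros = {
    "wide_runs": 173_673,
    "legbye_runs": 176_141,
    "batsman_runs": 70_845,
    "extra_runs": 169_541,
    "total_runs": 63_002,
}

def share(count, total=n_observations):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)

missing_pct = {col: share(count) for col, count in missing.items()}
zeros_pct = {col: share(count) for col, count in zeros.items()}

print(missing_pct)
print(zeros_pct)
print("missing cells in total:", sum(missing.values()))
```

The three missing-value columns together account for all 513118 missing cells reported in the dataset statistics.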

Reproduction

Analysis started: 2021-11-02 07:41:54.568008
Analysis finished: 2021-11-02 07:42:39.701862
Duration: 45.13 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
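
The reported duration is the difference between the two timestamps. As a quick check, using only the Python standard library and the timestamps listed above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2021-11-02 07:41:54.568008")
finished = datetime.fromisoformat("2021-11-02 07:42:39.701862")

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```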

Variables

match_id
Real number (ℝ≥0)

Distinct: 756
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1802.252957
Minimum: 1
Maximum: 11415
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 38
Q1: 190
Median: 379
Q3: 567
95-th percentile: 11314
Maximum: 11415
Range: 11414
Interquartile range (IQR): 377

Descriptive statistics

Standard deviation: 3472.322805
Coefficient of variation (CV): 1.926656739
Kurtosis: 2.245787145
Mean: 1802.252957
Median Absolute Deviation (MAD): 188
Skewness: 1.996380528
Sum: 322743855
Variance: 12057025.66
Monotonicity: Increasing

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
126 | 267 | 0.1%
34 | 263 | 0.1%
534 | 262 | 0.1%
476 | 262 | 0.1%
388 | 261 | 0.1%
190 | 259 | 0.1%
570 | 259 | 0.1%
536 | 258 | 0.1%
401 | 258 | 0.1%
257 | 257 | 0.1%
Other values (746) | 176472 | 98.5%

Value | Count | Frequency (%)
1 | 248 | 0.1%
2 | 247 | 0.1%
3 | 218 | 0.1%
4 | 247 | 0.1%
5 | 248 | 0.1%
6 | 216 | 0.1%
7 | 254 | 0.1%
8 | 212 | 0.1%
9 | 226 | 0.1%
10 | 239 | 0.1%

Value | Count | Frequency (%)
11415 | 248 | 0.1%
11414 | 239 | 0.1%
11413 | 252 | 0.1%
11412 | 237 | 0.1%
11347 | 228 | 0.1%
11346 | 235 | 0.1%
11345 | 246 | 0.1%
11344 | 224 | 0.1%
11343 | 234 | 0.1%
11342 | 249 | 0.1%
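
Several of the match_id statistics are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the squared standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the IQR and range are differences of quantiles. The sketch below checks those identities against the reported values; the tolerance is an assumption chosen to absorb the rounding in the report.

```python
import math

# Values copied from the match_id statistics and the dataset overview.
n = 179_078
total = 322_743_855
mean = 1802.252957
std = 3472.322805
variance = 12057025.66
cv = 1.926656739
q1, q3, iqr = 190, 567, 377
minimum, maximum, value_range = 1, 11415, 11414

TOL = 1e-6  # relative tolerance for rounded report values

checks = {
    "mean == sum / n": math.isclose(total / n, mean, rel_tol=TOL),
    "variance == std ** 2": math.isclose(std ** 2, variance, rel_tol=TOL),
    "cv == std / mean": math.isclose(std / mean, cv, rel_tol=TOL),
    "iqr == q3 - q1": q3 - q1 == iqr,
    "range == max - min": maximum - minimum == value_range,
}

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```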

inning
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 92742 | 51.8%
2 | 86240 | 48.2%
3 | 50 | < 0.1%
4 | 38 | < 0.1%
5 | 8 | < 0.1%

[Figure: Histogram of lengths of the category]
[Figure: Pie chart]

batting_team
Categorical

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 27
Median length: 16
Mean length: 17.99314265
Min length: 13

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Sunrisers Hyderabad
2nd row: Sunrisers Hyderabad
3rd row: Sunrisers Hyderabad
4th row: Sunrisers Hyderabad
5th row: Sunrisers Hyderabad

Common Values

Value | Count | Frequency (%)
Mumbai Indians | 22619 | 12.6%
Kings XI Punjab | 20931 | 11.7%
Royal Challengers Bangalore | 20908 | 11.7%
Kolkata Knight Riders | 20858 | 11.6%
Chennai Super Kings | 19762 | 11.0%
Delhi Daredevils | 18786 | 10.5%
Rajasthan Royals | 17292 | 9.7%
Sunrisers Hyderabad | 12908 | 7.2%
Deccan Chargers | 9034 | 5.0%
Pune Warriors | 5443 | 3.0%
Other values (4) | 10537 | 5.9%

[Figure: Histogram of lengths of the category]

Word | Count | Frequency (%)
kings | 40693 | 9.1%
mumbai | 22619 | 5.1%
indians | 22619 | 5.1%
xi | 20931 | 4.7%
punjab | 20931 | 4.7%
royal | 20908 | 4.7%
challengers | 20908 | 4.7%
bangalore | 20908 | 4.7%
kolkata | 20858 | 4.7%
knight | 20858 | 4.7%
Other values (21) | 213444 | 47.9%

bowling_team
Categorical

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 27
Median length: 16
Mean length: 18.01460258
Min length: 13

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Royal Challengers Bangalore
2nd row: Royal Challengers Bangalore
3rd row: Royal Challengers Bangalore
4th row: Royal Challengers Bangalore
5th row: Royal Challengers Bangalore

Common Values

Value | Count | Frequency (%)
Mumbai Indians | 22517 | 12.6%
Royal Challengers Bangalore | 21236 | 11.9%
Kolkata Knight Riders | 20940 | 11.7%
Kings XI Punjab | 20782 | 11.6%
Chennai Super Kings | 19556 | 10.9%
Delhi Daredevils | 18725 | 10.5%
Rajasthan Royals | 17382 | 9.7%
Sunrisers Hyderabad | 12779 | 7.1%
Deccan Chargers | 9039 | 5.0%
Pune Warriors | 5457 | 3.0%
Other values (4) | 10665 | 6.0%

[Figure: Histogram of lengths of the category]

Word | Count | Frequency (%)
kings | 40338 | 9.0%
mumbai | 22517 | 5.1%
indians | 22517 | 5.1%
royal | 21236 | 4.8%
challengers | 21236 | 4.8%
bangalore | 21236 | 4.8%
kolkata | 20940 | 4.7%
knight | 20940 | 4.7%
riders | 20940 | 4.7%
xi | 20782 | 4.7%
Other values (21) | 213145 | 47.8%

over
Real number (ℝ≥0)

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.16248785
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 5
Median: 10
Q3: 15
95-th percentile: 19
Maximum: 20
Range: 19
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 5.677684313
Coefficient of variation (CV): 0.558690391
Kurtosis: -1.183356367
Mean: 10.16248785
Median Absolute Deviation (MAD): 5
Skewness: 0.04901758304
Sum: 1819878
Variance: 32.23609916
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=20)]

Value | Count | Frequency (%)
1 | 9603 | 5.4%
2 | 9498 | 5.3%
3 | 9415 | 5.3%
4 | 9379 | 5.2%
5 | 9345 | 5.2%
6 | 9326 | 5.2%
7 | 9283 | 5.2%
8 | 9253 | 5.2%
9 | 9231 | 5.2%
10 | 9184 | 5.1%
Other values (10) | 85561 | 47.8%

Value | Count | Frequency (%)
20 | 6738 | 3.8%
19 | 7866 | 4.4%
18 | 8387 | 4.7%
17 | 8648 | 4.8%
16 | 8761 | 4.9%
15 | 8900 | 5.0%
14 | 8978 | 5.0%
13 | 9073 | 5.1%
12 | 9090 | 5.1%
11 | 9120 | 5.1%

ball
Real number (ℝ≥0)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.615586504
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 5
95-th percentile: 6
Maximum: 9
Range: 8
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.806965975
Coefficient of variation (CV): 0.4997711915
Kurtosis: -1.083107949
Mean: 3.615586504
Median Absolute Deviation (MAD): 2
Skewness: 0.09612230007
Sum: 647472
Variance: 3.265126035
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=9)]

Value | Count | Frequency (%)
1 | 29047 | 16.2%
2 | 28963 | 16.2%
3 | 28878 | 16.1%
4 | 28812 | 16.1%
5 | 28720 | 16.0%
6 | 28628 | 16.0%
7 | 5113 | 2.9%
8 | 795 | 0.4%
9 | 122 | 0.1%

batsman
Categorical

HIGH CARDINALITY

Distinct: 516
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 20
Median length: 9
Mean length: 9.318967154
Min length: 5

Unique

Unique: 13
Unique (%): < 0.1%

Sample

1st row: DA Warner
2nd row: DA Warner
3rd row: DA Warner
4th row: DA Warner
5th row: DA Warner

Common Values

Value | Count | Frequency (%)
V Kohli | 4211 | 2.4%
SK Raina | 4044 | 2.3%
RG Sharma | 3816 | 2.1%
S Dhawan | 3776 | 2.1%
G Gambhir | 3524 | 2.0%
RV Uthappa | 3492 | 1.9%
DA Warner | 3398 | 1.9%
MS Dhoni | 3318 | 1.9%
AM Rahane | 3215 | 1.8%
CH Gayle | 3131 | 1.7%
Other values (506) | 143153 | 79.9%

[Figure: Histogram of lengths of the category]

Word | Count | Frequency (%)
s | 6778 | 1.8%
v | 6474 | 1.8%
singh | 4936 | 1.3%
da | 4774 | 1.3%
sr | 4683 | 1.3%
sharma | 4675 | 1.3%
m | 4395 | 1.2%
de | 4367 | 1.2%
sk | 4324 | 1.2%
kohli | 4231 | 1.2%
Other values (686) | 317432 | 86.5%

non_striker
Categorical

HIGH CARDINALITY

Distinct: 511
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 20
Median length: 9
Mean length: 9.320647986
Min length: 5

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: S Dhawan
2nd row: S Dhawan
3rd row: S Dhawan
4th row: S Dhawan
5th row: S Dhawan

Common Values

Value | Count | Frequency (%)
SK Raina | 4173 | 2.3%
S Dhawan | 4090 | 2.3%
V Kohli | 4071 | 2.3%
RG Sharma | 3858 | 2.2%
G Gambhir | 3740 | 2.1%
AM Rahane | 3467 | 1.9%
RV Uthappa | 3381 | 1.9%
DA Warner | 3127 | 1.7%
CH Gayle | 3023 | 1.7%
AB de Villiers | 2996 | 1.7%
Other values (501) | 143152 | 79.9%

[Figure: Histogram of lengths of the category]

Word | Count | Frequency (%)
s | 7043 | 1.9%
v | 6487 | 1.8%
sr | 4897 | 1.3%
sharma | 4806 | 1.3%
singh | 4695 | 1.3%
m | 4518 | 1.2%
da | 4491 | 1.2%
sk | 4423 | 1.2%
de | 4315 | 1.2%
dhawan | 4209 | 1.1%
Other values (684) | 317210 | 86.4%

bowler
Categorical

HIGH CARDINALITY

Distinct: 405
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 17
Median length: 9
Mean length: 9.464836552
Min length: 5

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: TS Mills
2nd row: TS Mills
3rd row: TS Mills
4th row: TS Mills
5th row: TS Mills

Common Values

Value | Count | Frequency (%)
Harbhajan Singh | 3451 | 1.9%
A Mishra | 3172 | 1.8%
PP Chawla | 3157 | 1.8%
R Ashwin | 3016 | 1.7%
SL Malinga | 2974 | 1.7%
DJ Bravo | 2711 | 1.5%
B Kumar | 2707 | 1.5%
P Kumar | 2637 | 1.5%
UT Yadav | 2605 | 1.5%
SP Narine | 2600 | 1.5%
Other values (395) | 150048 | 83.8%

[Figure: Histogram of lengths of the category]

Word | Count | Frequency (%)
r | 9707 | 2.7%
singh | 9243 | 2.5%
sharma | 9188 | 2.5%
a | 8586 | 2.4%
kumar | 7561 | 2.1%
s | 6896 | 1.9%
m | 6348 | 1.7%
p | 5150 | 1.4%
pp | 5102 | 1.4%
b | 4200 | 1.2%
Other values (559) | 292640 | 80.3%

is_super_over
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 178997 | > 99.9%
1 | 81 | < 0.1%

[Figure: Histogram of lengths of the category]
[Figure: Pie chart]

wide_runs
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.03672142865
Minimum: 0
Maximum: 5
Zeros: 173673
Zeros (%): 97.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 5
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2511611312
Coefficient of variation (CV): 6.839633981
Kurtosis: 191.6858792
Mean: 0.03672142865
Median Absolute Deviation (MAD): 0
Skewness: 11.66307776
Sum: 6576
Variance: 0.06308191384
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=6)]

Value | Count | Frequency (%)
0 | 173673 | 97.0%
1 | 4915 | 2.7%
2 | 230 | 0.1%
5 | 208 | 0.1%
3 | 47 | < 0.1%
4 | 5 | < 0.1%
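
Because wide_runs takes only six distinct values, its sum, mean, variance, standard deviation and coefficient of variation can be recomputed from the frequency table alone. The sketch below does this in plain Python. It assumes the report uses the sample variance (dividing by n - 1, which is the pandas default for `Series.var()`); the reported value is consistent with that choice rather than with the population variance.

```python
import math

# Frequency table for wide_runs (value -> count), copied from the report.
wide_runs_counts = {0: 173_673, 1: 4_915, 2: 230, 3: 47, 4: 5, 5: 208}

n = sum(wide_runs_counts.values())
total = sum(value * count for value, count in wide_runs_counts.items())
mean = total / n

# Sum of squared deviations from the mean, weighted by frequency.
squared_dev = sum(count * (value - mean) ** 2
                  for value, count in wide_runs_counts.items())
variance = squared_dev / (n - 1)  # sample variance (ddof=1), an assumption
std = math.sqrt(variance)
cv = std / mean

print(f"n={n} sum={total} mean={mean:.10f}")
print(f"variance={variance:.10f} std={std:.10f} cv={cv:.9f}")
```

The same approach works for the other low-cardinality run columns, since their frequency tables are complete.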

bye_runs
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 178598 | 99.7%
1 | 324 | 0.2%
4 | 123 | 0.1%
2 | 31 | < 0.1%
3 | 2 | < 0.1%

[Figure: Histogram of lengths of the category]
[Figure: Pie chart]

legbye_runs
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.02113604128
Minimum: 0
Maximum: 5
Zeros: 176141
Zeros (%): 98.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 5
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.1949082998
Coefficient of variation (CV): 9.221608588
Kurtosis: 242.3265243
Mean: 0.02113604128
Median Absolute Deviation (MAD): 0
Skewness: 13.77728696
Sum: 3785
Variance: 0.03798924532
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=6)]

Value | Count | Frequency (%)
0 | 176141 | 98.4%
1 | 2558 | 1.4%
4 | 220 | 0.1%
2 | 138 | 0.1%
3 | 17 | < 0.1%
5 | 4 | < 0.1%

noball_runs
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 178364 | 99.6%
1 | 698 | 0.4%
2 | 9 | < 0.1%
5 | 6 | < 0.1%
3 | 1 | < 0.1%

[Figure: Histogram of lengths of the category]
[Figure: Pie chart]

penalty_runs
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 179076 | > 99.9%
5 | 2 | < 0.1%

[Figure: Histogram of lengths of the category]
[Figure: Pie chart]

batsman_runs
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.246864495
Minimum: 0
Maximum: 7
Zeros: 70845
Zeros (%): 39.6%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 1
95-th percentile: 4
Maximum: 7
Range: 7
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.608270266
Coefficient of variation (CV): 1.289851682
Kurtosis: 1.632692902
Mean: 1.246864495
Median Absolute Deviation (MAD): 1
Skewness: 1.582522721
Sum: 223286
Variance: 2.586533248
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=8)]

Value | Count | Frequency (%)
0 | 70845 | 39.6%
1 | 67523 | 37.7%
4 | 20392 | 11.4%
2 | 11471 | 6.4%
6 | 8170 | 4.6%
3 | 587 | 0.3%
5 | 79 | < 0.1%
7 | 11 | < 0.1%

extra_runs
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.06703224293
Minimum: 0
Maximum: 7
Zeros: 169541
Zeros (%): 94.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 7
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.3425529326
Coefficient of variation (CV): 5.110271082
Kurtosis: 91.227968
Mean: 0.06703224293
Median Absolute Deviation (MAD): 0
Skewness: 8.234162663
Sum: 12004
Variance: 0.1173425116
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=7)]

Value | Count | Frequency (%)
0 | 169541 | 94.7%
1 | 8495 | 4.7%
2 | 407 | 0.2%
4 | 348 | 0.2%
5 | 219 | 0.1%
3 | 67 | < 0.1%
7 | 1 | < 0.1%

total_runs
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.313896738
Minimum: 0
Maximum: 10
Zeros: 63002
Zeros (%): 35.2%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 1
95-th percentile: 4
Maximum: 10
Range: 10
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.605421643
Coefficient of variation (CV): 1.221878095
Kurtosis: 1.640138506
Mean: 1.313896738
Median Absolute Deviation (MAD): 1
Skewness: 1.556932828
Sum: 235290
Variance: 2.57737865
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=10)]

Value | Count | Frequency (%)
1 | 73059 | 40.8%
0 | 63002 | 35.2%
4 | 20599 | 11.5%
2 | 13125 | 7.3%
6 | 8148 | 4.5%
3 | 688 | 0.4%
5 | 339 | 0.2%
8 | 64 | < 0.1%
7 | 38 | < 0.1%
10 | 16 | < 0.1%
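
The high-correlation alerts among the run columns are consistent with an additive relationship: the column sums implied by the frequency tables satisfy extra_runs = wide + bye + legbye + noball + penalty, and total_runs = batsman_runs + extra_runs. The sketch below checks this at the level of column totals only. It does not prove that the identity holds on every row, which would require the underlying data; the helper name `column_sum` is illustrative.

```python
# Frequency tables (value -> count) copied from the variable sections.
tables = {
    "wide_runs": {0: 173_673, 1: 4_915, 2: 230, 3: 47, 4: 5, 5: 208},
    "bye_runs": {0: 178_598, 1: 324, 2: 31, 3: 2, 4: 123},
    "legbye_runs": {0: 176_141, 1: 2_558, 2: 138, 3: 17, 4: 220, 5: 4},
    "noball_runs": {0: 178_364, 1: 698, 2: 9, 3: 1, 5: 6},
    "penalty_runs": {0: 179_076, 5: 2},
    "batsman_runs": {0: 70_845, 1: 67_523, 2: 11_471, 3: 587,
                     4: 20_392, 5: 79, 6: 8_170, 7: 11},
    "extra_runs": {0: 169_541, 1: 8_495, 2: 407, 3: 67,
                   4: 348, 5: 219, 7: 1},
    "total_runs": {0: 63_002, 1: 73_059, 2: 13_125, 3: 688, 4: 20_599,
                   5: 339, 6: 8_148, 7: 38, 8: 64, 10: 16},
}

def column_sum(table):
    """Sum of a column reconstructed from its value -> count table."""
    return sum(value * count for value, count in table.items())

sums = {name: column_sum(table) for name, table in tables.items()}
row_counts = {name: sum(table.values()) for name, table in tables.items()}

extra_components = ["wide_runs", "bye_runs", "legbye_runs",
                    "noball_runs", "penalty_runs"]
extras_from_components = sum(sums[name] for name in extra_components)

print(sums)
print("extras from components:", extras_from_components)
print("batsman + extra:", sums["batsman_runs"] + sums["extra_runs"])
```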

player_dismissed
Categorical

HIGH CARDINALITY
MISSING

Distinct: 487
Distinct (%): 5.5%
Missing: 170244
Missing (%): 95.1%
Memory size: 1.4 MiB

Length

Max length: 20
Median length: 9
Mean length: 9.35340729
Min length: 5

Unique

Unique: 80
Unique (%): 0.9%

Sample

1st row: DA Warner
2nd row: S Dhawan
3rd row: MC Henriques
4th row: Yuvraj Singh
5th row: Mandeep Singh

Common Values

Value | Count | Frequency (%)
SK Raina | 162 | 0.1%
RG Sharma | 155 | 0.1%
RV Uthappa | 153 | 0.1%
V Kohli | 143 | 0.1%
S Dhawan | 137 | 0.1%
G Gambhir | 136 | 0.1%
KD Karthik | 135 | 0.1%
PA Patel | 126 | 0.1%
AM Rahane | 116 | 0.1%
AT Rayudu | 115 | 0.1%
Other values (477) | 7456 | 4.2%
(Missing) | 170244 | 95.1%

[Figure: Histogram of lengths of the category]

Word | Count | Frequency (%)
singh | 316 | 1.7%
s | 311 | 1.7%
v | 261 | 1.4%
r | 246 | 1.4%
m | 241 | 1.3%
sharma | 237 | 1.3%
sk | 189 | 1.0%
patel | 189 | 1.0%
sr | 184 | 1.0%
de | 175 | 1.0%
Other values (653) | 15744 | 87.0%

dismissal_kind
Categorical

HIGH CORRELATION
MISSING

Distinct: 9
Distinct (%): 0.1%
Missing: 170244
Missing (%): 95.1%
Memory size: 1.4 MiB

Length

Max length: 21
Median length: 6
Mean length: 6.223341635
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: caught
2nd row: caught
3rd row: caught
4th row: bowled
5th row: bowled

Common Values

Value | Count | Frequency (%)
caught | 5348 | 3.0%
bowled | 1581 | 0.9%
run out | 852 | 0.5%
lbw | 540 | 0.3%
stumped | 278 | 0.2%
caught and bowled | 211 | 0.1%
retired hurt | 12 | < 0.1%
hit wicket | 10 | < 0.1%
obstructing the field | 2 | < 0.1%
(Missing) | 170244 | 95.1%

[Figure: Histogram of lengths of the category]
[Figure: Pie chart]

Word | Count | Frequency (%)
caught | 5559 | 54.9%
bowled | 1792 | 17.7%
run | 852 | 8.4%
out | 852 | 8.4%
lbw | 540 | 5.3%
stumped | 278 | 2.7%
and | 211 | 2.1%
retired | 12 | 0.1%
hurt | 12 | 0.1%
hit | 10 | 0.1%
Other values (4) | 16 | 0.2%
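
The word-level table for dismissal_kind differs from the common-values table because multi-word categories contribute to several words: "caught and bowled" adds to "caught", "and" and "bowled". The sketch below reproduces the word counts from the category counts. Lowercasing and splitting on whitespace is an assumption about how the words were tokenized, although it reproduces the reported numbers.

```python
from collections import Counter

n_observations = 179_078
missing = 170_244

# Category counts copied from the Common Values table.
dismissal_counts = {
    "caught": 5348,
    "bowled": 1581,
    "run out": 852,
    "lbw": 540,
    "stumped": 278,
    "caught and bowled": 211,
    "retired hurt": 12,
    "hit wicket": 10,
    "obstructing the field": 2,
}

non_missing = sum(dismissal_counts.values())

# Each word of a category is counted once per occurrence of the category.
word_counts = Counter()
for kind, count in dismissal_counts.items():
    for word in kind.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
caught_share = round(100 * word_counts["caught"] / total_words, 1)

print("non-missing rows:", non_missing)
print(word_counts.most_common(7))
print(f"'caught' share of all words: {caught_share}%")
```

The non-missing rows (8834) plus the 170244 missing rows add up to the 179078 observations in the dataset.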

fielder
Categorical

HIGH CARDINALITY
MISSING

Distinct        499
Distinct (%)    7.7%
Missing         172630
Missing (%)     96.4%
Memory size     1.4 MiB

MS Dhoni            159
KD Karthik          152
RV Uthappa          125
SK Raina            115
AB de Villiers      114
Other values (494)  5783

Length

Max length     21
Median length  9
Mean length    9.462779156
Min length     5

Characters and Unicode

Total characters     0
Distinct characters  0
Distinct categories  0
Distinct scripts     0
Distinct blocks      0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      93
Unique (%)  1.4%

Sample

1st row  Mandeep Singh
2nd row  Sachin Baby
3rd row  Sachin Baby
4th row  DA Warner
5th row  BCJ Cutting

Common Values

Value               Count   Frequency (%)
MS Dhoni            159     0.1%
KD Karthik          152     0.1%
RV Uthappa          125     0.1%
SK Raina            115     0.1%
AB de Villiers      114     0.1%
PA Patel            97      0.1%
RG Sharma           92      0.1%
V Kohli             90      0.1%
KA Pollard          85      < 0.1%
WP Saha             82      < 0.1%
Other values (489)  5337    3.0%
(Missing)           172630  96.4%

Length

Histogram of lengths of the category
Value               Count  Frequency (%)
singh               204    1.5%
r                   202    1.5%
s                   198    1.5%
ms                  194    1.5%
m                   192    1.4%
sharma              188    1.4%
de                  169    1.3%
karthik             166    1.2%
patel               164    1.2%
dhoni               159    1.2%
Other values (618)  11488  86.2%


Interactions

[64 interaction plots; images not preserved in this text export]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic relationships than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
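
As a minimal sketch of that calculation, the pure-Python example below ranks both variables and then applies the covariance-over-standard-deviations formula to the ranks. Giving tied values the average of their positions is an assumption of this sketch, and the sample values are made up for illustration.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Illustrative data: y = x**3 is nonlinear but strictly increasing.
x = [0, 1, 2, 3, 4]
y = [0, 1, 8, 27, 64]
rho = spearman(x, y)
r = pearson(x, y)
print(rho, r)
```

Because the relationship is monotonic, ρ comes out at (numerically) 1 while Pearson's r on the raw values stays below 1.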

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
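
A minimal pure-Python sketch of this formula follows, together with a check of the location and scale invariance mentioned above. The sample values and the shift and scale constants are made up for illustration.

```python
from statistics import mean

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Illustrative data only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson(x, y)

# Shifting and rescaling either variable (positive scale) leaves r unchanged.
x_shifted = [10 * a + 3 for a in x]
y_shifted = [0.5 * b - 7 for b in y]
r_shifted = pearson(x_shifted, y_shifted)

print(r, r_shifted)
```

The shared 1/n (or 1/(n-1)) factors of the covariance and standard deviations cancel, which is why the sketch works with plain sums.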

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
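
The pair-counting definition can be sketched directly in pure Python. This is the simple form described in the text (often called tau-a); tie-adjusted variants such as tau-b differ when the data contain ties, and which variant the report computes is not stated here. The sample values are made up.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, ignoring tie corrections."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Illustrative data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant, giving τ = (5 - 1) / 6.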

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
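
One common way to write the bias correction is sketched below in pure Python, starting from a contingency table of counts. It assumes every row and column total is non-zero, and it is an illustration of the correction rather than the report's own implementation; the example tables are made up.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction terms: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    k_corr = k - (k - 1) ** 2 / (n - 1)
    r_corr = r - (r - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Illustrative tables: proportional rows (independence) and a diagonal table.
v_independent = cramers_v_corrected([[10, 20], [20, 40]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
print(v_independent, v_perfect)
```

The two example tables hit the ends of the range: proportional rows give 0, and a purely diagonal table gives a value of (numerically) 1.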

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
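
The nullity correlation shown in the heatmap can be sketched as the Pearson correlation between present/missing indicators of two columns. The pure-Python example below uses a handful of made-up rows shaped like the player_dismissed, dismissal_kind and fielder columns, with None standing in for a missing value; it illustrates the idea and is not the plotting library's own code.

```python
from statistics import mean

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 0/1 'value is present' indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5

# Hypothetical rows: most deliveries have no dismissal, and an lbw has no fielder.
player_dismissed = [None, None, "SR Watson", None, "SN Thakur", None]
dismissal_kind = [None, None, "run out", None, "lbw", None]
fielder = [None, None, "KH Pandya", None, None, None]

kind_vs_player = nullity_correlation(dismissal_kind, player_dismissed)
kind_vs_fielder = nullity_correlation(dismissal_kind, fielder)
print(kind_vs_player, kind_vs_fielder)
```

Columns that are always missing together score 1, while a column that is only sometimes filled when the other is present scores lower.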

Sample

First rows

   match_id  inning  batting_team         bowling_team                 over  ball  batsman    non_striker  bowler       is_super_over  wide_runs  bye_runs  legbye_runs  noball_runs  penalty_runs  batsman_runs  extra_runs  total_runs  player_dismissed  dismissal_kind  fielder
0  1         1       Sunrisers Hyderabad  Royal Challengers Bangalore  1     1     DA Warner  S Dhawan     TS Mills     0              0          0         0            0            0             0             0           0           NaN               NaN             NaN
1  1         1       Sunrisers Hyderabad  Royal Challengers Bangalore  1     2     DA Warner  S Dhawan     TS Mills     0              0          0         0            0            0             0             0           0           NaN               NaN             NaN
2  1         1       Sunrisers Hyderabad  Royal Challengers Bangalore  1     3     DA Warner  S Dhawan     TS Mills     0              0          0         0            0            0             4             0           4           NaN               NaN             NaN
3  1         1       Sunrisers Hyderabad  Royal Challengers Bangalore  1     4     DA Warner  S Dhawan     TS Mills     0              0          0         0            0            0             0             0           0           NaN               NaN             NaN
4  1         1       Sunrisers Hyderabad  Royal Challengers Bangalore  1     5     DA Warner  S Dhawan     TS Mills     0              2          0         0            0            0             0             2           2           NaN               NaN             NaN
5  1         1       Sunrisers Hyderabad  Royal Challengers Bangalore  1     6     S Dhawan   DA Warner    TS Mills     0              0          0         0            0            0             0             0           0           NaN               NaN             NaN
6  1         1       Sunrisers Hyderabad  Royal Challengers Bangalore  1     7     S Dhawan   DA Warner    TS Mills     0              0          0         1            0            0             0             1           1           NaN               NaN             NaN
7  1         1       Sunrisers Hyderabad  Royal Challengers Bangalore  2     1     S Dhawan   DA Warner    A Choudhary  0              0          0         0            0            0             1             0           1           NaN               NaN             NaN
8  1         1       Sunrisers Hyderabad  Royal Challengers Bangalore  2     2     DA Warner  S Dhawan     A Choudhary  0              0          0         0            0            0             4             0           4           NaN               NaN             NaN
9  1         1       Sunrisers Hyderabad  Royal Challengers Bangalore  2     3     DA Warner  S Dhawan     A Choudhary  0              0          0         0            1            0             0             1           1           NaN               NaN             NaN

Last rows

        match_id  inning  batting_team         bowling_team    over  ball  batsman    non_striker  bowler      is_super_over  wide_runs  bye_runs  legbye_runs  noball_runs  penalty_runs  batsman_runs  extra_runs  total_runs  player_dismissed  dismissal_kind  fielder
179068  11415     2       Chennai Super Kings  Mumbai Indians  19    3     RA Jadeja  SR Watson    JJ Bumrah   0              0          0         0            0            0             2             0           2           NaN               NaN             NaN
179069  11415     2       Chennai Super Kings  Mumbai Indians  19    4     RA Jadeja  SR Watson    JJ Bumrah   0              0          0         0            0            0             0             0           0           NaN               NaN             NaN
179070  11415     2       Chennai Super Kings  Mumbai Indians  19    5     RA Jadeja  SR Watson    JJ Bumrah   0              0          0         0            0            0             2             0           2           NaN               NaN             NaN
179071  11415     2       Chennai Super Kings  Mumbai Indians  19    6     RA Jadeja  SR Watson    JJ Bumrah   0              0          4         0            0            0             4             4           8           NaN               NaN             NaN
179072  11415     2       Chennai Super Kings  Mumbai Indians  20    1     SR Watson  RA Jadeja    SL Malinga  0              0          0         0            0            0             1             0           1           NaN               NaN             NaN
179073  11415     2       Chennai Super Kings  Mumbai Indians  20    2     RA Jadeja  SR Watson    SL Malinga  0              0          0         0            0            0             1             0           1           NaN               NaN             NaN
179074  11415     2       Chennai Super Kings  Mumbai Indians  20    3     SR Watson  RA Jadeja    SL Malinga  0              0          0         0            0            0             2             0           2           NaN               NaN             NaN
179075  11415     2       Chennai Super Kings  Mumbai Indians  20    4     SR Watson  RA Jadeja    SL Malinga  0              0          0         0            0            0             1             0           1           SR Watson         run out         KH Pandya
179076  11415     2       Chennai Super Kings  Mumbai Indians  20    5     SN Thakur  RA Jadeja    SL Malinga  0              0          0         0            0            0             2             0           2           NaN               NaN             NaN
179077  11415     2       Chennai Super Kings  Mumbai Indians  20    6     SN Thakur  RA Jadeja    SL Malinga  0              0          0         0            0            0             0             0           0           SN Thakur         lbw             NaN